Many say that "The Cloud" will be the next game-changing computing platform, and the race is on to define and capture that domain. Historically, new platforms take off when independent developers start to make innovative use of platform-specific features. In the case of Cloud Computing, that means exploiting distributed systems in a datacenter. But there is as yet no widely used programming model that lets a developer easily coordinate the distributed power of a datacenter.
The BOOM Project at Berkeley aims to change this, via data-centric programming paradigms and tools, grounded in the implementation of significant cloud systems. As a first step, we developed BOOM Analytics: an API-compliant reimplementation of Hadoop and HDFS written in the Overlog declarative language. Developed in a relatively short nine-month design cycle, our Overlog interpreter and Hadoop implementation perform as well as the standard Java-only implementation, with a compact and easily extensible codebase. Within that timeframe we also extended BOOM Analytics with features not yet available in Hadoop, including Paxos-driven high availability, parallel scale-out of master nodes, and intrinsic monitoring and debugging facilities.
Taking lessons from that experience, we are in the midst of designing Bloom, a new data-centric paradigm for programming distributed systems. The foundation for Bloom is a temporal logic called Dedalus, which captures traditionally imperative issues like state modification and asynchrony in a fully declarative, model-theoretic framework.
This talk will give an overview of our ideas for data-centric programming, reflect on our experience building and extending BOOM Analytics in Overlog, and discuss some of our future plans for language design and cloud system development.